This notebook covers opening files, looking at pixels, and some simple image processing techniques.
We'll use the following sample image, stolen from the Internet. But you can use whatever image you like.
If you can't see an image above then you haven't got the full tutorial code from github. In the same directory as this notebook you should also have the following files:
1 Getting started notebook.ipynb
2 Fundamentals.ipynb
3 Image stats and image processing.ipynb
4 Features.ipynb
5 detecting faces and other things.ipynb
6 Moving away from the notebook.ipynb
cheat.py
common.py
common.pyc
edgedemo.png
haarcascade_frontalface_default.xml
LICENSE
noidea.jpg
play.py
README.md
start.py
test.jpg
video.py
video.pyc
If you haven't got them, make sure you've got the whole repo from [https://github.com/handee/opencv-gettingstarted]
First we need to import the relevant libraries: OpenCV itself, Numpy, and a couple of others. common and video are simple data handling and opening routines that you can find in the OpenCV Python samples directory or in the github repo linked above. We'll start each notebook with the same imports - you don't need all of them every time (so this is bad form, really) but it's easier to just copy and paste.
In [1]:
# these imports let you use opencv
import cv2 #opencv itself
import common #some useful opencv functions
import video # some video stuff
import numpy as np # matrix manipulations
#the following are to do with this interactive notebook code
%matplotlib inline
from matplotlib import pyplot as plt # this lets you draw inline pictures in the notebooks
import pylab # this allows you to control figure size
pylab.rcParams['figure.figsize'] = (10.0, 8.0) # this controls figure size in the notebook
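If you want to check exactly which versions you've ended up with, a quick sanity check looks like this (just a minimal sketch):
In [ ]:
# a minimal sketch: print the library versions to check the install
print(cv2.__version__)
print(np.__version__)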
Now we can open an image:
In [2]:
input_image=cv2.imread('noidea.jpg')
We can find out various things about that image
In [3]:
print(input_image.size)
In [6]:
print(input_image.shape)
In [21]:
print(input_image.dtype)
Gotcha: that last one (the datatype) is one of the tricky things about working in Python. As it's dynamically rather than statically typed, Python will happily give you arrays of the same size but different types, and some functions will return arrays of types that you probably don't want. Being able to check and inspect the datatype like this is very useful, and it's one of the things I often find myself doing when debugging.
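To give a feel for why this matters, here's a minimal sketch of the sort of datatype check and conversion you might end up doing (the float scaling is just an arbitrary example):
In [ ]:
# a minimal sketch of checking and converting datatypes
height, width, channels = input_image.shape  # unpack the shape tuple
print(height, width, channels)
as_float = input_image.astype(np.float32) / 255.0  # many operations expect floats in [0,1]
print(as_float.dtype)
back_to_uint8 = (as_float * 255).astype(np.uint8)  # convert back to 8-bit before displaying
print(back_to_uint8.dtype)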
In [22]:
plt.imshow(input_image)
Out[22]:
The colours look wrong because matplotlib expects images in RGB order, and that illustrates something key about OpenCV: it doesn't store images in RGB format, but in BGR format.
In [23]:
# split channels
b,g,r=cv2.split(input_image)
# show one of the channels (this is red - see that the sky is kind of dark. try changing it to b)
plt.imshow(r, cmap='gray')
Out[23]:
In [24]:
merged=cv2.merge([r,g,b])
# merge takes an array of single channel matrices
plt.imshow(merged)
Out[24]:
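As an aside, because the channels are just the last axis of a numpy array, the same swap can be done with a plain numpy slice (a minimal sketch):
In [ ]:
# a minimal sketch: reversing the last axis turns BGR into RGB without splitting channels
rgb_view = input_image[:, :, ::-1]
plt.imshow(rgb_view)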
OpenCV also has a function specifically for converting between image colourspaces, cv2.cvtColor, so rather than splitting and merging channels by hand you can use that instead. It is usually marginally faster...
There are something like 250 colour-related flags in OpenCV for conversion and display. The ones you are most likely to use are COLOR_BGR2RGB for RGB conversion, COLOR_BGR2GRAY for conversion to greyscale, and COLOR_BGR2HSV for conversion to Hue, Saturation, Value colour space. [http://docs.opencv.org/trunk/de/d25/imgproc_color_conversions.html] has more information on how these colour conversions are done.
In [25]:
COLORflags = [flag for flag in dir(cv2) if flag.startswith('COLOR') ]
print(len(COLORflags))
# print(COLORflags)
# if you want to see them all, rather than just a count
In [26]:
opencv_merged=cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB)
plt.imshow(opencv_merged)
Out[26]:
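The greyscale flag mentioned above works the same way; here's a minimal sketch of that conversion:
In [ ]:
# a minimal sketch: conversion to greyscale with the COLOR_BGR2GRAY flag
grey = cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY)
print(grey.shape)  # single channel, so no third dimension in the shape
plt.imshow(grey, cmap='gray')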
Images in Python OpenCV are numpy arrays. Numpy arrays are optimised for fast array operations, so there are usually fast methods for doing whole-array calculations which don't involve writing all the detail yourself. This means it's usually bad practice to access individual pixels one at a time, but you can.
In [27]:
pixel = input_image[100,100]
print(pixel)
In [28]:
input_image[100,100] = [0,0,0]
pixelnew = input_image[100,100]
print(pixelnew)
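To make that point about avoiding per-pixel work concrete, here's a minimal sketch contrasting a Python loop with the equivalent whole-array numpy operation (inverting the image is just an arbitrary example):
In [ ]:
# a minimal sketch: prefer whole-array operations to per-pixel loops
# the slow way would be a nested Python loop over every pixel:
#   for y in range(input_image.shape[0]):
#       for x in range(input_image.shape[1]):
#           input_image[y, x] = 255 - input_image[y, x]
# the numpy way does the same thing in one vectorised operation:
inverted = 255 - input_image
plt.imshow(cv2.cvtColor(inverted, cv2.COLOR_BGR2RGB))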
In [29]:
dogface = input_image[60:250, 70:350]
plt.imshow(dogface)
Out[29]:
In [30]:
fresh_image=cv2.imread('noidea.jpg') # it's either start with a fresh read of the image,
# or end up with dogfaces on dogfaces on dogfaces
# as you re-run parts of the notebook but not others...
fresh_image[200:200+dogface.shape[0], 200:200+dogface.shape[1]]=dogface
print(dogface.shape[0])
print(dogface.shape[1])
plt.imshow(fresh_image)
Out[30]:
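The "fresh read" above is worth dwelling on: numpy slices are views onto the same underlying data rather than copies, which is why edits pile up if you re-run cells. Here's a minimal sketch of the difference between a view and a copy:
In [ ]:
# a minimal sketch: slicing gives a view, .copy() gives an independent array
view_region = fresh_image[200:200+dogface.shape[0], 200:200+dogface.shape[1]]
copy_region = view_region.copy()
fresh_image[:] = 0        # blank the whole image in place
print(view_region.max())  # the view sees the change: prints 0
print(copy_region.max())  # the copy still holds the dog face pixels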
In OpenCV Python style, as I have mentioned, images are numpy arrays. There are some superb numpy array manipulation tutorials out there: [http://www.scipy-lectures.org/intro/numpy/numpy.html#indexing-and-slicing] is a great introduction if you've not done it before. The getting and setting of regions above uses slicing, though, and I'd like to finish this notebook with a little more detail on what is going on there.
In [31]:
freshim2 = cv2.imread("noidea.jpg")
crop = freshim2[100:400, 130:300]
plt.imshow(crop)
Out[31]:
The key thing to note here is that the slicing works like
[top_y:bottom_y, left_x:right_x]
This can also be thought of as
[y:y+height, x:x+width]
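For example, with the corner and size spelled out as variables (the numbers here just reproduce the crop above), it reads like this:
In [ ]:
# a minimal sketch: the same crop written with an explicit corner and size
x, y = 130, 100  # left_x, top_y
w, h = 170, 300  # width, height
named_crop = freshim2[y:y+h, x:x+w]
plt.imshow(named_crop)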
You can also use slicing to separate out channels. In this case you want
[y:y+height, x:x+width, channel]
where channel represents the colour channel you're interested in - this could be 0 = blue, 1 = green or 2 = red if you're dealing with a default OpenCV image, but if you've got an image that has been converted it could be something else. Here's an example that converts to HSV then selects the S (Saturation) channel of the same crop as above:
In [32]:
hsvim = cv2.cvtColor(freshim2, cv2.COLOR_BGR2HSV)
bcrop = hsvim[100:400, 130:300, 1]
plt.imshow(bcrop, cmap="gray")
Out[32]: